In software engineering, a spinlock is a lock where the thread simply waits in a loop ("spins") repeatedly checking until the lock becomes available. Since the thread remains active but isn't performing a useful task, the use of such a lock is a kind of busy waiting. Once acquired, spinlocks will usually be held until they are explicitly released, although in some implementations they may be automatically released if the thread being waited on (that which holds the lock) blocks, or "goes to sleep".
Spinlocks are efficient if threads are only likely to be blocked for a short period, as they avoid overhead from operating system process re-scheduling or context switching. For this reason, spinlocks are often used inside operating system kernels. However, spinlocks become wasteful if held for longer durations, preventing other threads from running and requiring re-scheduling. The longer a lock is held by a thread, the greater the risk that it will be interrupted by the OS scheduler while holding the lock. If this happens, other threads will be left "spinning" (repeatedly trying to acquire the lock), while the thread holding the lock is not making progress towards releasing it. The result is an indefinite postponement until the thread holding the lock can finish and release it. This is especially true on a single-processor system, where each waiting thread of the same priority is likely to waste its quantum (allocated time where a thread can run) spinning until the thread that holds the lock is finally finished.
Implementing spin locks correctly is difficult because one must take into account the possibility of simultaneous access to the lock to prevent race conditions. Generally this is only possible with special assembly language instructions, such as atomic test-and-set operations, and cannot be easily implemented in high-level programming languages or those languages which don't support truly atomic operations.[1] On architectures without such operations, or if high-level language implementation is required, a non-atomic locking algorithm may be used, e.g. Peterson's algorithm. But note that such an implementation may require more memory than a spinlock, be slower to allow progress after unlocking, and may not be implementable in a high-level language if out-of-order execution is allowed.
Contents |
The following example uses x86 assembly language to implement a spinlock. It will work on any Intel 80386 compatible processor.
; Intel syntax lock: ; The lock variable. 1 = locked, 0 = unlocked. dd 0 spin_lock: mov eax, 1 ; Set the EAX register to 1. xchg eax, [lock] ; Atomically swap the EAX register with ; the lock variable. ; This will always store 1 to the lock, leaving ; previous value in the EAX register. test eax, eax ; Test EAX with itself. Among other things, this will ; set the processor's Zero Flag if EAX is 0. ; If EAX is 0, then the lock was unlocked and ; we just locked it. ; Otherwise, EAX is 1 and we didn't acquire the lock. jnz spin_lock ; Jump back to the XCHG instruction if the Zero Flag is ; not set; the lock was previously locked, and so ; we need to spin until it becomes unlocked. ret ; The lock has been acquired, return to the calling ; function. spin_unlock: mov eax, 0 ; Set the EAX register to 0. xchg eax, [lock] ; Atomically swap the EAX register with ; the lock variable. ret ; The lock has been released.
The above is a simple implementation which is easy to understand (for a programmer who understands x86 assembly language) and works on all x86 architecture CPUs. However a number of performance optimizations are possible:
On later implementations of the x86 architecture, spin_unlock can safely use an unlocked MOV instead of the locked XCHG, which is much faster. This is due to subtle memory ordering rules which support this, even though MOV isn't a full memory barrier. However some processors (some Cyrix processors, some revisions of the Intel Pentium Pro (due to bugs), and earlier Pentium and i486 SMP systems) will do the wrong thing and data protected by the lock could be corrupted. On most non-x86 architectures, explicit memory barrier instructions or atomic instructions (like in the example) must be used, or there may be special "unlock" instructions (as on IA-64) which provide the needed memory ordering.
To reduce inter-CPU bus traffic, when the lock is not acquired, the code should loop reading without trying to write anything, until it reads a changed value. Because of MESI caching protocols, this causes the cache line for the lock to become "Shared"; then there is remarkably no bus traffic while a CPU is waiting for the lock. This optimization is effective on all CPU architectures that have a cache per CPU, because MESI is so widespread.
The primary disadvantage of a spinlock is that it wastes time while waiting to acquire the lock that might be productively spent elsewhere. There are two alternatives that avoid this:
Most operating systems (including Solaris, Mac OS X and FreeBSD) use a hybrid approach called "adaptive mutex". The idea is to use a spinlock when trying to access a resource locked by a currently-running thread, but to sleep if the thread is not currently running. (The latter is always the case on single-processor systems.)[2]
In a military context, the term "spin lock" can be used to refer to a mechanism within a munition's fuze which arms it upon firing. Implemented on gun-launched ammunition, the rotation imparted on the projectile causes the lock to disengage from a "safe" condition to an "armed" one.